CTRL+Z: Recovering Anonymized Social Graphs
Authors
Abstract
Social graphs derived from online social interactions contain a wealth of information that is nowadays extensively used by both industry and academia. However, due to the sensitivity of information contained in such social graphs, they need to be properly anonymized before release. Most of the graph anonymization techniques that have been proposed to sanitize social graph data rely on the perturbation of the original graph’s structure, more specifically of its edge set. In this paper, we identify a fundamental weakness of these edge-based anonymization mechanisms and exploit it to recover most of the original graph structure. First, we propose a method to quantify an edge’s plausibility in a given graph by relying on graph embedding. Our experiments on three real-life social network datasets under two widely known graph anonymization mechanisms demonstrate that this method can very effectively detect fake edges with AUC values above 0.95 in most cases. Second, by relying on Gaussian mixture models and maximum a posteriori probability estimation, we derive an optimal decision rule to detect whether an edge is fake based on the observed graph data. We further demonstrate that this approach concretely jeopardizes the privacy guarantees provided by the considered graph anonymization mechanisms. To mitigate this vulnerability, we propose a method to generate fake edges as plausible as possible given the graph structure and incorporate it into the existing anonymization mechanisms. Our evaluation demonstrates that the enhanced mechanisms not only decrease the chances of graph recovery (with AUC dropping by up to 35%), but also provide even better graph utility than existing anonymization methods.
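The second step of the abstract (a Gaussian mixture model plus a maximum a posteriori decision) can be illustrated with a small sketch. Everything below is an assumption made for illustration, not the authors' implementation: plausibility is taken to be the cosine similarity of the two endpoint embeddings, the mixture is a one-dimensional, two-component model fitted with plain EM, the scores are synthetic, and all function names are hypothetical.

```python
import math
import random


def cosine_similarity(u, v):
    """Assumed plausibility score: cosine similarity of two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_two_component_gmm(scores, iterations=200):
    """Fit a 1-D, two-component Gaussian mixture to the scores with plain EM."""
    ordered = sorted(scores)
    half = len(ordered) // 2
    # Initialise one component on the lower half of the scores, one on the upper.
    means = [sum(ordered[:half]) / half,
             sum(ordered[half:]) / (len(ordered) - half)]
    overall_mean = sum(scores) / len(scores)
    overall_var = sum((s - overall_mean) ** 2 for s in scores) / len(scores)
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: posterior responsibility of each component for each score.
        resp = []
        for s in scores:
            joint = [weights[k] * gaussian_pdf(s, means[k], variances[k])
                     for k in (0, 1)]
            total = sum(joint) or 1e-300
            resp.append([j / total for j in joint])
        # M-step: re-estimate weight, mean and variance of each component.
        for k in (0, 1):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / len(scores)
            means[k] = sum(r[k] * s for r, s in zip(resp, scores)) / nk
            var_k = sum(r[k] * (s - means[k]) ** 2
                        for r, s in zip(resp, scores)) / nk
            variances[k] = max(var_k, 1e-4)  # floor avoids a collapsed component
    return weights, means, variances


def is_fake(score, weights, means, variances):
    """MAP rule: flag the edge as fake if the low-plausibility component
    has the larger posterior (prior weight times likelihood)."""
    fake = 0 if means[0] < means[1] else 1
    real = 1 - fake
    post_fake = weights[fake] * gaussian_pdf(score, means[fake], variances[fake])
    post_real = weights[real] * gaussian_pdf(score, means[real], variances[real])
    return post_fake > post_real


# Synthetic plausibility scores (made up): real edges score high, fake edges low.
rng = random.Random(0)
scores = [rng.gauss(0.8, 0.08) for _ in range(300)]
scores += [rng.gauss(0.2, 0.10) for _ in range(100)]

weights, means, variances = fit_two_component_gmm(scores)
print("component means:", sorted(means))
print("score 0.15 flagged as fake:", is_fake(0.15, weights, means, variances))
print("score 0.85 flagged as fake:", is_fake(0.85, weights, means, variances))
```

In practice the scores would come from embeddings learned on the anonymized graph, with `cosine_similarity` applied to the two endpoints of every observed edge. The sketch only shows how a mixture fit turns a list of scores into a per-edge decision.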
Similar resources
An Iterative Algorithm for Graph De-anonymization
The availability of social network data is indispensable for numerous types of research. Nevertheless, data owners are often reluctant to release social network data, as the release may reveal the private information of the individuals involved in the data. To address this problem, several techniques have been proposed to anonymize social networks for privacy preserving publications. To evaluat...
Finding the Most Appropriate Auxiliary Data for Social Graph Deanonymization
Given only a handful of local structural features about the nodes of an anonymized social graph, how can an adversary select an auxiliary (a.k.a. non-anonymized, known) graph to help him/her deanonymize (a.k.a. re-identify) the individuals in the graph? Examples of local structural features are node’s degree, node’s clustering coefficient, edge density of the node’s neighbors, etc. The objective ...
Survey on Privacy Preserved Methods for Social Networking in Cloud Computing
Nowadays, companies publish social networks to third parties, for example a cloud service provider (CSP), for marketing reasons. Preserving privacy when publishing social network data becomes an important issue. In this scenario we identify a novel type of privacy attack termed the 1*-neighborhood attack. Here we assume that the attacker has knowledge about a degree of the ta...
Learning to Predict the Presence of Nodes in Anonymized Graphs
We address the problem of learning presence models of nodes in anonymized graphs. Specifically, given two graphs, a “known” graph G1 and an anonymized graph G2, we want to learn a classification model that predicts the presence of nodes in G2. Solving this problem does not require solving the node-correspondence problem across graphs. All that it requires is the computation of the pr...
Anonymization of Centralized and Distributed Social Networks by Incremental Clustering
Social media has grown enormously in recent years, as is well known. Social media sites such as Facebook, Twitter, LinkedIn, Google+ and many more hold public and confidential/personal information about their users. It is mandatory to provide security to those users. Social network graphs are anonymized before being published to others, which might be third per...
Journal:
- CoRR
Volume: abs/1711.05441
Pages: -
Publication date: 2017